INTRODUCTION

This report describes new concepts for two related problems:

1.  Scaling  of  proficiency  measures 

2.  Setting  proficiency  standards  for  training 

It  is  believed  that  when  the  methodological  problems  for 
applying  these  concepts  are  solved,  military  training  researchers 
will  possess  more  powerful  tools  for  evaluating  training  programs 
and  generalizing  research  findings  from  various  specific  studies. 

THE  PROBLEM 


The procedures used to develop proficiency tests for military
training research result in scores which have definite limitations.
A review  of  the  procedures  used  in  developing  proficiency  tests  will 
clarify  the  nature  of  these  limitations: 

1. The tests are commonly preceded by a job analysis and
represent  a sample  of  tasks  required  by  a particular  job.  This 
means  that  scores  are  specific  to  a given  job. 

2.  Scoring  procedures  for  a given  task  are  based  upon  various 
considerations,  such  as  judgments  of  the  seriousness  of  errors,  or 
ease  of  observation  of  behavior.  When  scores  for  the  tasks  are 
combined, the resulting total score is in terms of units which are
an unknown quantity with respect to such major classifications of
measures as rank-order, equal-interval or ratio scales (10).

3.  Norms  for  the  scores  are  based  upon  a specific  sample 
of subjects. Furthermore, these norms are generally expressed in
standard  scores.  This  means  that  the  score  represents  a crude 
approximation  to  an  individual's  rank  order  in  a given  sample. 

The  limitations  described  above  lead  to  serious  shortcomings, 
both of a practical and a research nature:

1.  The  use  of  these  scores  in  training  research  renders 
practical recommendations difficult to make in certain situations.
Where a less expensive training program yields measured proficiency
equal to or greater than that developed by a more expensive training
program, then there is little difficulty in making appropriate
recommendations.  However,  when  a more  expensive  training  program 
also  produces  a higher  level  of  proficiency,  there  is  usually  little 
basis  upon  which  to  make  a decision. 

2. The relative nature of the norms used in current
proficiency scores provides little basis for defining satisfactory
performance. One of the important uses of proficiency tests is as
quality control measures for the graduates of training programs,
both formal and on-the-job. An individual's score on a proficiency
test usually provides no guidance as to whether he is satisfactorily
trained.

3. One of the most important uses of measures is to
provide  indices  whereby  direct  comparisons  may  be  made  of  objects 
and  situations  of  widely  varying  characteristics.  Examples  of 
such  indices  would  be:  Amount  of  learning  per  student-week; 
amount  of  proficiency  per  instructor;  or  the  amount  of  proficiency 
per  dollar  cost.  Such  indices  would  provide  important  tools  for 
training  managers  in  evaluating  the  efficiency  of  training  programs. 
These  indices  are  typically  formed  by  the  algebraic  process  of 
division,  although  other  processes  may  be  used  as  well.  The 
process  of  division  is  legitimately  performed  only  upon  ratio 
scales.  The  uncertainty  with  regard  to  the  basic  nature  of  the 
scales  used  in  current  proficiency  tests  means  that  such  indices 
cannot  be  formed.  Thus  a powerful  means  for  comparing  widely 
different  training  situations,  and  thus  increasing  the  generality 
of  research,  is  lost. 

4. Since the dimensions and units used in the typical
proficiency  test  are  specific  to  the  particular  research  study, 
it  is  not  possible  to  make  direct  comparisons  of  the  effects  of 
different  experimenters  and  relate  them  to  a common  basis.  It 
frequently  occurs  that  different  researchers  are  taking  common 
approaches  to  common  training  problems,  although  with  variations 
in  procedure.  Because  each  of  these  researchers  will  be  using  as 
evaluative criteria proficiency tests developed for particular jobs,
and yielding scores which are specific to the particular samples
used,  there  can  be  no  common  basis  of  comparison. 

The  preceding  comments  point  out  the  need  for  proficiency 
measures with the following characteristics:

1. The proficiency measures should be ratio scales. More
mathematical operations can be performed on ratio scales than
upon other kinds of scales. With ratio scales it is possible to
develop new and useful indices involving various ratios for
comparison of degrees of proficiency.

2.  Proficiency  measures  should  be  expressed  in  terms  which 
are  sufficiently  general  to  permit  comparisons  of  the  results  of 
widely  different  researchers.  In  other  words,  they  should  be 
capable of measuring Proficiency in general, rather than
Proficiency-as-a-NIKE-AJAX-Platoon-Leader, for example.

3. In situations in which the need for practical recommendations
is paramount, proficiency measures should permit the making
of a broader range of recommendations concerning levels of
proficiency in relation to other criteria, particularly criteria which
are related to the cost of training.

The purpose of this report is to propose new scales for training
research which will have the characteristics described above,
and  to  discuss  problems  associated  with  setting  proficiency  standards. 


RATIONALE 


The  purpose  of  this  section  is  to  propose  a model  for  the 
determination  of  satisfactory  performance  based  generally  upon 
decision theory. The method here proposed will be called
Consequence Analysis because it assumes that the effect of an error
or  a lack  of  proficiency  can  be  determined  only  through  an  analysis 
of  the  consequences  of  making  the  error. 

The general rationale underlying Consequence Analysis is as
follows: The making of an error has a consequence. These consequences
may be different depending upon the situation in which the
error is made. The cost of each consequence can be estimated or
determined. Finally, the expected cost of an error can be determined
by multiplying the cost of each consequence by the probability
of the occurrence of the consequence and summing over consequences.
The  end  result  of  this  analysis  will  be  the  expected  cost  of  the 
error. 


PROCEDURES 


The  initial  step  in  consequence  analysis  is  to  identify  all 
possible  errors  that  can  be  made  on  the  proficiency  test.  In  a 
multiple-choice question the selection of each mislead on an item
may have different consequences. It has been frequently recognized
that some wrong answers are "wronger" than others. In a performance
test, it is quite likely that the making of different types of errors
may  have  different  consequences. 

When  all  the  possible  errors  that  can  be  made  on  a proficiency 
test have been identified, it is necessary to identify the consequences
of the errors. At this stage of the analysis the services
of a group of qualified job incumbents would seem to be a necessity.
It is important to keep in mind that a given error may have different
consequences under different conditions and that the same consequences
may have a different cost under different conditions.
Accordingly it is important to catalog not only the consequences
of making the error, but the conditions under which these consequences
may occur. A cost estimate should be assigned to each
combination  of  consequence  and  situation.  In  many  instances  these 
cost  estimates  can  be  made  quite  accurately  if  we  will  make  the 
effort  to  determine  them.  In  other  instances  it  may  be  necessary 
to  make  less  accurate  estimates. 

Each combination of consequence and situation has, in addition
to a cost figure, a probability of occurrence. Again these probabilities
are to be estimated as accurately as is feasible. The
final step in consequence analysis is to multiply the cost of each
consequence-situation combination by its associated probability.
When these products are summed the result is an estimate of the
expected  cost  of  the  error.  Figure  1 shows  a format  which  can  be 
used  in  Consequence  Analysis. 


Figure 1

FORMAT FOR CONSEQUENCE ANALYSIS

Error: ____________________

                        Cost      Probability      Expected Cost
                        (C)       (p)              (Cp)

Consequence
    Situation
    Situation

Consequence
    Situation
    Situation

Expected cost of error = $\sum Cp$
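
The computation laid out in the format for Consequence Analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Outcome` record, the function name `expected_cost`, and every consequence, situation, cost, and probability below are hypothetical and do not come from any actual analysis.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """One row of the analysis: a consequence occurring in a situation."""
    consequence: str
    situation: str
    cost: float         # estimated cost C of this combination
    probability: float  # estimated probability p of this combination


def expected_cost(outcomes):
    """Sum of C * p over all consequence-situation combinations."""
    return sum(o.cost * o.probability for o in outcomes)


# Hypothetical entries for a single error; all figures are invented.
outcomes = [
    Outcome("equipment damaged", "field exercise", 1000.0, 0.10),
    Outcome("equipment damaged", "combat", 5000.0, 0.02),
    Outcome("mission delayed", "field exercise", 200.0, 0.25),
    Outcome("mission delayed", "combat", 2000.0, 0.05),
]

print(expected_cost(outcomes))
```

Keeping one record per consequence-situation combination mirrors the rows of the format, so improved estimates can be substituted row by row without changing the computation.
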

It is well recognized that in practical application the model
just proposed will yield results only as accurate as the estimates
which go into it. It seems quite reasonable to expect that the
ingenuity of researchers will yield improvements in methodology which
will make for more accurate estimates of the values which enter
into  the  determination  of  the  expected  costs  of  an  error. 

IMPLICATIONS

With further effort being devoted to improving the accuracy
of the various estimates used in Consequence Analysis and in increasing
the efficiency of its application, Consequence Analysis
may be expected to provide a powerful tool for determining the
answers to a number of important practical questions which training
researchers frequently face.

The  principal  usefulness  of  Consequence  Analysis  is  that  it 
provides  a metric  for  lack  of  proficiency  which  can  be  balanced 
against  the  training  costs  required  to  overcome  this  lack. 

Psychologists  have  frequently  been  unable  to  justify  to 
research  consumers  or  themselves  the  adoption  of  training  methods 
which  increase  proficiency  but  at  the  same  time  cost  more  money. 
Consequence  Analysis,  by  providing  a monetary  yardstick,  may  be 
very  useful  in  converting  improved  proficiency  into  a saving  which 
can be set against training costs.
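
The balancing of proficiency against training cost might be sketched as follows. The decision rule shown (choose the program with the lowest sum of training cost and expected cost of remaining errors) is one plausible reading of the argument, not a procedure stated in this report, and the program names and dollar figures are hypothetical.

```python
def total_cost(training_cost, expected_error_cost):
    """Training cost plus the expected cost of errors graduates still make."""
    return training_cost + expected_error_cost


# Hypothetical programs: (training cost, expected error cost) per graduate.
programs = {
    "short course": (800.0, 1500.0),
    "long course": (1400.0, 600.0),
}

totals = {name: total_cost(*costs) for name, costs in programs.items()}
best = min(totals, key=totals.get)

print(totals)
print(best)
```

With these invented figures the more expensive course is preferred, because the saving in expected error cost exceeds the added training cost.
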

The  problem  of  optimum  length  of  training  programs  also 
finds an evaluative instrument in Consequence Analysis. It is conceivable
that in some instances, reducing the length of a course is
an action that one cannot afford because it costs too much in the
consequences  of  errors. 


It is common practice to graduate an individual from training
provided he performs correctly on a test sampling the content
of  the  training  program.  The  use  of  Consequence  Analysis  in 
weighting  test  items  is  likely  to  result  in  graduates  who  have 
learned  those  skills  and  knowledges  whose  cost,  if  left  unlearned, 
is  of  major  importance. 

Along similar lines, Consequence Analysis may result in
important  gains  by  using  it  to  determine  the  cost  of  promotion 
from  one  sub-unit  of  training  to  the  next.  It  might  be  more 
profitable  to  have  an  individual  repeat  one  sub-unit  of  training 
than  to  promote  him  to  the  next  one. 

It  should  be  recognized  that  Consequence  Analysis  is  likely 
to find its widest application in those jobs in which the tasks involve
well-defined procedures. Many of the technical tasks performed
by military personnel are of this nature. It is from consideration
of training problems for these individuals that Consequence
Analysis was conceived.

At  the  same  time,  it  should  be  possible  to  take  a more  positive 
approach. If exceptionally meritorious behavior were identified by
means of approaches like the Critical Incidents technique, Consequence
Analysis would be applied to these behaviors. Instead of
costs, savings would be entered into the analysis tables.

DETERMINING THE LEVEL OF PROFICIENCY DESIRED
OF HUMAN COMPONENTS OF MISSILE SYSTEMS


INTRODUCTION 


The concept of a weapon system includes not only the equipment
involved in the system but the human components as well. Both
the  human  and  the  equipment  components  of  a weapon  system  must 
operate  at  a high  degree  of  reliability  in  order  for  the  weapon 
system  to  be  effective. 

Those  concerned  with  the  reliability  of  equipment  components, 
such  as  Lusser  (3),  have  developed  a set  of  concepts  and  procedures 
for  setting  reliability  standards.  Similar  concepts  and  procedures 
for determining proficiency standards of the human component,
however, are presently lacking.

The purpose of this section is to consider concepts and procedures
that are related to equipment reliability and examine, by
analogy,  their  implications  for  human  proficiency.  It  is  felt  that 
the  application  to  the  human  component  of  requirements  similar  to 
those  of  the  equipment  component  of  a weapon  system  will  shed  new 
light  on  the  adequacy  of  our  present  notions  about  setting  proficiency 
standards  for  humans.  These  concepts  and  procedures  have  been  adapted 
from Lusser (3).

The reliability of equipment components is defined as the
probability  of  successful  functioning  under  operating  conditions. 

The  reliability  of  the  over-all  system  consists  of  the  product  of 
the  reliabilities  of  all  of  the  components  of  the  system. 

$$P_{total} = P_1 P_2 P_3 \cdots P_n$$

When  human  operators  and  maintenance  personnel  are  included 
as  components  of  the  over-all  system,  it  is  quite  clear  that  there 
is  a serious  need  for  a high  degree  of  reliability  in  terms  of 
probability  of  correct  performance,  for  these  personnel. 
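
The effect of adding human components to the product of reliabilities can be illustrated with a short sketch. The component reliabilities below are hypothetical values chosen only to show the arithmetic; the function name is likewise an assumption.

```python
import math


def system_reliability(component_reliabilities):
    """Product of the reliabilities of all components of the system."""
    return math.prod(component_reliabilities)


# Hypothetical system: three equipment components and two human components.
equipment = [0.99, 0.98, 0.995]
humans = [0.95, 0.90]

print(system_reliability(equipment))
print(system_reliability(equipment + humans))
```

Because every factor is at most 1, each added component can only lower the product, which is why a less reliable human component limits the reliability of the whole system.
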

THE  SAFETY  MARGIN 

Safety  Margins  Applied  to  Components 

Lusser proposes that the average strength of a component be
separated from the maximum severity of stress to which that component
will be exposed by means of a safety margin which is measured
in standard deviation units. The maximum stress is called the reliability
boundary, and the safety margin is then the difference
between the reliability boundary and the mean strength of the component,
measured in standard deviation units which are based upon
measures of the strength of the component. (Figure 2)

Figure 2

SAFETY MARGIN FOR EQUIPMENT COMPONENTS

[Figure: a scale in standard deviation units marking the mean strength of the component and the maximum severity of stress.]

Difference  Between  Human  and  Machine  Components 

In  order  to  apply  the  model  developed  by  Lusser  (3)  to  the 
determination  of  reliability  standards  for  human  components  of 
weapon  systems,  it  is  necessary  to  clearly  describe  the  differences 
between  human  and  machine  components.  1)  The  strength  of  machine 
components  is  measured  in  continuous  measures,  based  upon  their 
resistance to a given force. On the other hand, the human equivalent
of strength is proficiency, which is usually measured by
noncontinuous variables based upon the presence or absence of error.
2) Although components may vary among each other, variability from
one time period to another for the same individual must be considered
for the human as well as differences between humans. 3) For machine
components, maximum stress can be specified on the same scale and
with the same units as the strength of the component. For humans,
the equivalent of maximum stress cannot be so quantitatively determined.

THE  HUMAN  ANALOGY 

In  order  to  carry  out  the  analogy  between  determination  of 
reliability standards for machine components and a similar determination
for human components of weapon systems, the following are
needed: 1) A definition for the human of resistance to stress and
the reliability boundary. 2) A continuous scale for measuring
resistance to stress. 3) A procedure for at least ranking stresses
so that conditions of maximum stress can be determined.

Resistance to Stress

In  a machine  component,  strength  is  measured  by  subjecting 
it to various forces. For the human component, the equivalent of
strength would be correct task performance. Any environmental
change  which  increases  errors  for  a given  individual  or  group  of 
individuals,  can  be  considered  stressful.  Therefore,  resistance 
to  errors  can  be  used  as  a measure  of  stress. 

Naturally, errors are not equal in importance. Some errors
have minor consequences. Others have major consequences. The
notion of Consequence Analysis, that of determining the cost of the
consequences of errors, should be considered here.

The  Reliability  Boundary 

For  machine  components,  as  stated  above,  the  reliability 
boundary is the maximum severity of stress to which a component
will be subjected. However, for machine components, the stress and
strength of the component are both measured in the same unit. This
is not the case for human components of a system. In the previous
section, the strength of a human component has been defined in
terms of lack of errors in performance. Similarly, stress has been
defined as an environmental change which increases errors. In order
to avoid a circularity of definitions, a different basis must be
used for determining the reliability boundary for the human components
of a weapon system.

There are several possibilities for defining the reliability
boundary  in  cost  terms. 

Finan (1), for example, makes the point that in training we
must be certain that the proficiency of our troops exceeds that of
our potential enemies. While this is undoubtedly the ideal, there
are many problems involved in securing accurate data.

Another  possibility  for  defining  the  reliability  boundary  is 
in  terms  of  the  cost  of  training.  If  the  cost  of  training  a person 
is  established,  then  the  cost  of  not  training  him  could  be  established 
by  Consequence  Analysis. 

Still another possibility is to define the reliability boundary
as the cost of the equipment which the person maintains or
operates.  Or,  in  some  instances,  the  cost  of  failure  to  accomplish 
the  unit  mission  might  be  appropriate. 

Further  work  should  explore  the  suitability  of  these  various 
bases for defining the reliability boundary. Such problems as the
relative  stringency  of  the  various  boundaries  should  be  studied. 

A Continuous  Scale  for  Resistance  to  Stress 

In order to determine the safety margin, strength, or its
equivalent for humans, proficiency, must be expressed in continuous
terms. However, an error is a single point occurrence. There is
then a need for a procedure for converting errors into a continuous
scale.

A useful way of doing this would be to use Consequence
Analysis and convert the errors to the cost of their consequences.
The  continuous  scale  required  for  determining  safety  margins  would 
then  be  the  cost  of  the  consequences  of  making  errors. 
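
One way the conversion from discrete errors to a continuous cost scale might look in practice is sketched below. The error names, their expected costs, and the trainee records are all hypothetical; the expected costs are assumed to have been obtained beforehand from a Consequence Analysis.

```python
import statistics

# Hypothetical expected cost of each error, assumed to come from a
# prior Consequence Analysis.
error_cost = {
    "wrong fuse setting": 350.0,
    "skipped checklist step": 40.0,
    "misread dial": 120.0,
}


def cost_score(errors_made):
    """Total expected cost of the errors one person made on the test."""
    return sum(error_cost[e] for e in errors_made)


# Hypothetical test records: the errors each trainee made.
records = {
    "trainee A": [],
    "trainee B": ["skipped checklist step"],
    "trainee C": ["misread dial", "skipped checklist step"],
    "trainee D": ["wrong fuse setting"],
}

scores = {name: cost_score(errs) for name, errs in records.items()}
values = list(scores.values())

print(scores)
print(statistics.mean(values), statistics.stdev(values))
```

The resulting scores have a mean and a standard deviation in cost units, which is what a safety margin expressed in standard deviation units requires.
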

Procedures  for  Scaling  Stresses 

If  we  accept  the  number  of  errors  an  individual  makes  as 
an inverse measure of his resistance to stress, then any environmental
condition which increases errors is a stress. The number
of errors made on a given task has been a matter of concern to
test and measurement researchers for some time. One of the standard
items of information one obtains on a proficiency test is the proportion
of errors. This concern with errors has led to a considerable
amount of information concerning task and environmental
characteristics which make for increase in errors. Included among
these factors are the following: Degradation of stimulus cues,
increased  time  requirements,  fatigue,  the  performance  of  concurrent 
tasks,  and  negative  transfer,  to  mention  a few. 

The  importance  of  methods  for  scaling  stresses  becomes  more 
critical at the stage of quality control through proficiency testing
than it does at the point of determining the reliability standards
for human components of weapon systems. It is especially
important  that  proficiency  measures  be  devised  which  will  test  the 
limits  of  human  performance  under  the  most  extreme  conditions  under 
which the weapon system will be employed.

Determining the Safety Margin

Having  established  the  reliability  boundary  as  the  cost  of  a 
missile,  how  many  standard  deviations  above  this  point  should  be 
the  performance  of  the  human  components  of  the  weapon  system,  when 
that  performance  is  measured  in  terms  of  the  consequences  of  errors? 
Lusser  points  out  that  there  is  no  fixed  procedure  for  determining 
the safety margin. How many standard deviation units must be included
in the safety margin will depend upon the presence of various
contingencies, each with its own particular contribution to the
over-all safety margin. The following contingencies are adapted
from Lusser's discussion, but are not direct translations of his
list of contingencies. The particular margins contributed by each
contingency are again judged in their relative weight by the frame
of reference presented by Lusser's set of contingencies. The contingencies
and their weights are listed below:

 1. Uncertainty in determining service conditions.              1

 2. Uncertainty in methods of evaluation of personnel.          1

 3. Uncertainty in estimating reliability of supervision.       2

 4. Uncertainty in estimating consequences of errors.           2

 5. Employment in low-risk equipment, which can simply
    be repaired and set right again.                            0

 6. Employment in high-risk equipment, in which human
    error can make for complete loss.                           5

 7. Employment in ultra high-risk equipment, in which
    human life or national prestige may be affected.           10

 8. Less than complete sampling of tasks in proficiency
    tests.                                                      2

 9. Scatter in proficiency test scores.                        1-3

10. Deviation of proficiency test conditions from those
    of maximum stress.                                         1-3

The  total  safety  margin  is  determined  by  taking  the  square 
root  of  the  sum  of  the  squares  of  each  of  the  contingency  margins, 
for  example: 


$$\text{Safety Margin} = \sqrt{1^2 + 1^2 + 2^2 + 2^2 + 5^2 + 2^2 + 3^2 + 2^2} = \sqrt{52} \approx 7.2$$

This result is presented graphically in Figure 3.
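
The root-sum-of-squares combination of contingency margins can be checked with a short sketch. The function name is an assumption; the eight margins are the ones appearing in the worked example (contingencies 1, 2, 3, 4, 6, 8, 9, and 10, with 3 and 2 taken for the last two).

```python
import math


def total_safety_margin(contingency_margins):
    """Square root of the sum of the squared contingency margins."""
    return math.sqrt(sum(m * m for m in contingency_margins))


# Margins used in the worked example in the text.
margins = [1, 1, 2, 2, 5, 2, 3, 2]

print(round(total_safety_margin(margins), 1))
print(sum(margins))
```

The squares sum to 52, so the combined margin is about 7.2 standard deviation units, well below the simple sum of 18 that would result from adding the margins directly.
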


IMPLICATIONS  FOR  TRAINING  RESEARCH 

The  reliability  of  a missile  system  is  the  product  of  the 
reliabilities  of  the  individual  components.  The  reliability  of 
the human components should be equal to that of the equipment components
if missile systems or weapon systems in general are to be
reliable. 

This  analysis  of  the  problem  of  insuring  reliability  of  human 
components has indicated a number of instances in which our present
procedures  and  expectations  regarding  proficiency  testing  are  highly 
inadequate.  These  instances  will  be  described  below. 

Proficiency  test  scores  are  generally  relative  to  the  group 
from  which  they  are  obtained.  They  are  either  made  relative  by 
means  of  standardization  procedures  such  as  percentile  ranking  or 
standard  scoring,  or  the  difficulty  of  the  items  is  adjusted  to 
this  group.  For  the  human  components  of  missile  systems,  this 
relativity  is  inadequate.  A meaningful  ratio  scale  is  required. 

It  is  proposed  that  scaling  errors  in  terms  of  the  cost  of  the 
consequences  of  the  errors  would  make  for  such  a scale. 

At  the  present  time  there  is  no  absolute  standard  against 
which to measure the adequacy of training. The adequacy of training
must be measured by comparison of one training program with
another. The use of the Safety Margin for the evaluation of training
would permit the direct measurement of the adequacy of training.

Proficiency  testing  and  achievement  testing  make  much  use  of 
written  tests  because  they  are  relatively  inexpensive.  By  putting 
both  proficiency  and  school  achievement  testing  in  a context  of 
quality  control  of  components,  the  conclusion  is  reached  that: 

1. Testing must occur in realistic situations, covering
actual tasks to be performed under a wide range of
conditions.

2. Attention must be given to testing the limits of human
performance, especially under the most stressful conditions
expected to occur in the actual employment of
the missile system.

This  analysis  has  indicated  the  need  for  new  standards  of 
rigor  in  developing  and  applying  proficiency  tests.  Since  present 
standards  of  training  adequacy  are  based  on  existing  concepts  of 
proficiency  measurement,  the  new  standards  may  be  expected  to  have 
considerable impact upon conceptions of what constitutes adequate
training.  It  is  very  likely  that  present  standards  of  training 
adequacy  must  be  revised  upward  to  a considerable  extent. 

INFORMATION  MODELS 

Consequence Analysis as a method of scaling proficiency test
scores appears to have its greatest potential value for those
situations in which it is desired to develop a basis for practical
recommendations concerning training. In many researches the matter
of practical recommendations is not as important. Another possibility
for scaling proficiency tests which possesses both the requirements
of a ratio scale and independence of particular units of
measurement is given by information theory.

In  the  following  discussion  of  the  application  of  information 
theory  models  to  proficiency  measurement,  technical  discussions  of 
formulae  will  be  avoided.  The  interested  reader  is  referred  to  the 
following references (2, 4, 5, 6, 7, 8, 9).

What is Information?

Information is equivalent to uncertainty or variance (8).
If  a situation  is  highly  uncertain,  with  many  possible  alternatives 
that  might  occur,  we  obtain  more  information  by  observing  what 
actually occurs than we obtain in a situation which was more certain
and with fewer possible alternatives. The concept of variance
is similarly related to the amount of information. A large amount
of variance means that there is uncertainty about what will actually
occur. Then a particular observation will yield a large amount
of information. On the other hand, if the variance is small,
making a particular observation does not yield as much information,
since there are fewer possibilities of various occurrences.

The  unit  of  information  used  in  studies  in  the  information 
theory  framework  is  the  bit,  which  stands  for  binary  digit.  A bit 
is  that  amount  of  information  required  to  reduce  the  number  of 
alternatives  by  one-half.  The  bit  is  thus  independent  of  the 
particular units and dimensions used to measure variance or uncertainty,
and thus will permit the comparison of results obtained
in  widely  different  experimental  situations. 
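
The relation between the number of alternatives and the number of bits can be illustrated as follows. The sketch uses the standard Shannon measure, which the report refers to but does not spell out; the function names and the example probabilities are assumptions made for illustration.

```python
import math


def bits_for_equally_likely(n_alternatives):
    """Information in one observation when n alternatives are equally likely."""
    return math.log2(n_alternatives)


def entropy_bits(probabilities):
    """Shannon entropy -sum(p * log2 p), in bits; zero terms are skipped."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Halving the number of alternatives removes exactly one bit.
print(bits_for_equally_likely(8))
print(bits_for_equally_likely(4))

# Four equally likely alternatives versus four unequal ones.
print(entropy_bits([0.25, 0.25, 0.25, 0.25]))
print(entropy_bits([0.7, 0.1, 0.1, 0.1]))
```

The unequal case yields fewer bits than the equally likely case, which matches the statement that a more certain situation gives less information per observation.
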

Several  different  models  based  on  information  theory  and 
measurement have been used in psychology. Two of these appear to
be of particular value for training research. These are the redundancy
model and the transmission model.

The Redundancy Model

The redundancy model has been applied primarily in studies of
language (6, 7). The maximum amount of information is contained in
situations where all alternatives are equally likely to occur. Thus,
since the English language contains primarily 26 letters and a space,
the maximum amount of information would be indicated by English if
the occurrence of letters and spaces were all equally likely. Of
course, it is obvious that English does not operate this way. The
letter "q", for instance, never occurs except just prior to the
letter "u". There are also other constraints placed upon the usage
of the symbols of the English alphabet by our language habits. These
constraints, then, mean that less than maximum information is transmitted
using the English alphabet. Accordingly, the alphabet when
used  to  express  language  is  redundant. 

One  way  of  looking  at  training  is  to  consider  it  a process 
for bringing responses under the control of appropriate stimuli.
Thus, the range of possible responses to a given stimulus is reduced,
and we may consider that the relative redundancy of the
responses  to  these  stimuli  has  increased.  In  terms  of  information 
theory,  then,  the  purpose  of  training  is  to  increase  redundancy. 

One of the major advantages of the redundancy model is that
certain important baselines are already available. Estimates
of the amount of information in single letters and words in connected
English have already been developed (6). Thus, since proficiency
tests are samples of English text, the techniques of computing the
amount of information in a proficiency test can be applied and
the results related to the existing estimates of redundancy in
English text. The number of different responses given to the
same item of a proficiency test can be expected to be smaller for a
trained group of subjects than for an untrained group of subjects.
Thus these results, when measured in information theory terms, can
be used as a means for computing the relative amount of redundancy
developed by training.
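
A minimal sketch of such a comparison follows. The response lists, the
assumed four response alternatives, and the function names are all
hypothetical and serve only to show the arithmetic; they are not data
from any study.

```python
import math
from collections import Counter

def response_entropy(responses):
    """Bits of information in the responses given to one test item."""
    counts = Counter(responses)
    n = len(responses)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def response_redundancy(responses, n_alternatives):
    """Relative redundancy, assuming n_alternatives possible responses."""
    return 1 - response_entropy(responses) / math.log2(n_alternatives)

# Hypothetical responses of eight subjects per group to the same item.
untrained = ["a", "b", "c", "d", "a", "b", "c", "d"]
trained = ["a", "a", "a", "a", "a", "a", "a", "b"]

before = response_redundancy(untrained, n_alternatives=4)
after = response_redundancy(trained, n_alternatives=4)
print(after > before)  # responses of the trained group are more redundant
```

The difference between the two redundancy values is one candidate
index of the redundancy developed by training, under the assumption
that both groups face the same set of response alternatives.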

Another possibility for the application of the redundancy
model lies in the currently active area of automated instruction.
One of the presumably desirable characteristics of certain types
of automatic teaching procedures is that the content should be
programmed in such a way that the student never makes a mistake.
Stated another way, this requirement means that responses to
stimuli should be completely redundant. The techniques of information
measurement can be applied, then, to determining the degree of
redundancy attained in a given program or the effect of different
procedures in approaching this high level of redundancy.

Another possible use of the redundancy model is in research
on the effectiveness of various types of job aids. Since the job aid
can be interpreted as a means of reducing the variability of
on-the-job behavior, the redundancy model would apply here also.

The Transmission Model


The transmission model considers the human to be a channel
for transmitting information. There is input in the form of stimuli.
There is output in the form of responses. Information is
transmitted through the human to the extent that responses are highly
correlated with stimuli. Thus, whereas information is equivalent
to variance, transmitted information is equivalent to covariance
or correlation.
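
The idea can be sketched as a calculation on a table of stimulus by
response frequencies. In the illustration below, transmitted
information is taken as the input information plus the output
information minus the joint information, which is the usual
information-theoretic definition. The function names and the two small
tables are assumptions made for the example.

```python
import math

def information(counts):
    """Bits of information in a list of frequencies."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def transmitted_information(table):
    """T = H(stimulus) + H(response) - H(stimulus, response).

    table[i][j] is the number of times stimulus i drew response j.
    """
    row_totals = [sum(row) for row in table]        # stimulus frequencies
    col_totals = [sum(col) for col in zip(*table)]  # response frequencies
    joint = [c for row in table for c in row]
    return information(row_totals) + information(col_totals) - information(joint)

# Hypothetical tables for two stimuli and two responses.
perfect = [[10, 0], [0, 10]]   # each stimulus always draws its own response
unrelated = [[5, 5], [5, 5]]   # responses bear no relation to stimuli

print(transmitted_information(perfect))    # 1.0 bit transmitted
print(transmitted_information(unrelated))  # 0.0 bits transmitted
```

When responses are fully determined by stimuli, all of the input
information is transmitted; when they are unrelated, none is.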

As the amount of information in the input is increased, there
is normally an increase in the amount of information in the output.
There is generally a limit to the amount of information transmitted
through the channel, however, and eventually a point is reached at
which additional amounts of information in the input do not
result in additional information being transmitted through the channel.
The maximum amount of information which can be transmitted through
the communication channel is called the channel capacity.

Another possible way of looking at training is to consider
it a process for increasing the channel capacity of the individual.
Thus, an individual with greater training would be expected to be
able to transmit more information than an individual with little
training. In such an individual there would be a high correlation
between the stimulus inputs and the response outputs.

The techniques of data analysis for the transmission model are
different from those for the redundancy model. In the transmission
model the analysis techniques are more complicated than in the
redundancy model (2, 4).

However, the transmission model has one major advantage over
the redundancy model: the various stimulus or input components
can be analyzed in a manner similar to the way the effect
of different variables can be isolated in an analysis of variance.
The amount of transmitted information attributable to each
component of the stimulus can then be identified (4).
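
One way to picture this partition is sketched below for a stimulus
with two components, u and v. The total transmission from both
components is split into the transmission from u alone plus the
transmission from v with u held constant, the latter obtained here by
subtraction. This is a simplified illustration under assumed data and
names, not a reproduction of the procedures in the cited references.

```python
import math
from collections import Counter

def information(counter):
    """Bits of information in a Counter of frequencies."""
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counter.values() if c > 0)

def transmission(observations, stim_index):
    """Transmitted information between selected stimulus components
    and the response. Each observation is (u, v, response)."""
    xs = Counter(tuple(o[i] for i in stim_index) for o in observations)
    ys = Counter(o[-1] for o in observations)
    xy = Counter((tuple(o[i] for i in stim_index), o[-1]) for o in observations)
    return information(xs) + information(ys) - information(xy)

# Hypothetical data: the response identifies both components exactly.
observations = [(u, v, 2 * u + v) for u in (0, 1) for v in (0, 1)] * 5

total = transmission(observations, (0, 1))  # from u and v together
from_u = transmission(observations, (0,))   # from u alone
from_v_given_u = total - from_u             # from v with u held constant

print(total, from_u, from_v_given_u)  # 2.0 1.0 1.0
```

In this artificial case the two components contribute equally. With
real data the components may also interact, in which case the share
found for one component depends on whether the other is held constant.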

Most of the kinds of analysis which can be performed using the
redundancy model can also be performed with the transmission model.
The choice must be based upon the complexity of the analysis desired.

The  use  of  the  transmission  model  in  prior  research  on  memory 
suggests  that  one  way  of  increasing  the  channel  capacity  of  the 
human  is  by  recoding  the  material  submitted  to  him  (8).  Thus,  the 
channel  capacity,  or  the  maximum  amount  of  learning,  can  be  increased 
by  recoding  information  into  a set  of  symbols,  each  symbol  of  which 
carries  more  information  with  it. 
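
A simple illustration of recoding is sketched below. The choice of
binary digits regrouped into octal digits is an assumption made for
the example, as are the function name and the digit string. Each
binary digit carries 1 bit, while each octal digit carries 3 bits, so
the same message is carried by one third as many symbols.

```python
import math

def recode_binary_to_octal(bits):
    """Regroup a string of binary digits into octal digits, three at a time."""
    width = 3 * math.ceil(len(bits) / 3)
    padded = bits.zfill(width)  # pad on the left so groups are complete
    return [str(int(padded[i:i + 3], 2)) for i in range(0, width, 3)]

message = "101000100111001110"          # 18 binary symbols, 1 bit each
recoded = recode_binary_to_octal(message)

print(recoded)                           # ['5', '0', '4', '7', '1', '6']
print(len(message), len(recoded))        # 18 symbols become 6 symbols
print(math.log2(2), math.log2(8))        # 1.0 and 3.0 bits per symbol
```

If the number of symbols an individual can handle is roughly fixed,
recoding into richer symbols raises the amount of information handled.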


SUMMARY 
There is a definite need for proficiency measures in military
training which have the characteristics of ratio scales with widely
general dimensions. For studies with practical implications these
measures also need to be criterion-related.

Models for proficiency measures based on decision theory
and information theory are described and possible uses discussed.
Consideration is given to the problem of specifying proficiency
standards.